Biostatistics For Dummies (Monika Wahi, John Pezzullo)

If you transform your data to get it to assume a normal distribution, any analyses done on it

will need to be “untransformed” to be interpreted. For example, if you have a data set of patients

with different lengths of stay in a hospital, you will likely have skewed data. If you log-transform

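these data so that they are normally distributed and then calculate statistics (such as a mean),

you will need to apply the inverse log transformation to the result before you interpret it. Note that

the back-transformed mean of the logs is the geometric mean, not the arithmetic mean, of the original values.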
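As a rough illustration of the round trip described above, here is a minimal Python sketch. The lengths of stay are made-up values, and the natural log is assumed (the text does not say which base to use); the variable names are hypothetical.

```python
import math

# Hypothetical lengths of stay in days (made-up, right-skewed values).
lengths_of_stay = [1, 2, 2, 3, 4, 5, 8, 30]

# Log-transform each value (natural log assumed), then average on the log scale.
logs = [math.log(x) for x in lengths_of_stay]
mean_log = sum(logs) / len(logs)

# Back-transform with the inverse of the natural log (exp).
# The result is the geometric mean of the original values, in days.
back_transformed_mean = math.exp(mean_log)

# Arithmetic mean of the untransformed values, for comparison.
arithmetic_mean = sum(lengths_of_stay) / len(lengths_of_stay)

print(f"mean on log scale:     {mean_log:.3f}")
print(f"back-transformed mean: {back_transformed_mean:.2f} days")
print(f"arithmetic mean:       {arithmetic_mean:.2f} days")
```

The mean on the log scale is not in days, so it cannot be read directly. Only the back-transformed value is in the original units, and for skewed data like these it will usually sit below the arithmetic mean.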

But sometimes your data are not normally distributed, and for whatever reason, you give up on trying

to do a parametric test. Maybe you can’t find a good transformation for your data, or maybe you don’t

want to have to undo the transformation in order to do your interpretation, or maybe you simply have

too small a sample to be able to perceive a clear parametric distribution when you make a

histogram. Fortunately, statisticians have developed other tests that you can use that are not based on

the assumption that your data are normally distributed or follow any other parametric distribution. Unsurprisingly,

these are called nonparametric tests. Most of the common classic parametric tests have nonparametric

counterparts you can use as an alternative. As you may expect, the most widely known and commonly

used nonparametric tests are those that correspond to the most widely known and commonly used

classical tests. Some of these are shown in Table 3-2.

TABLE 3-2 Nonparametric Counterparts of Classic Tests

Classic Parametric Test                                Nonparametric Equivalent
-----------------------------------------------------  ------------------------------
One-group or paired Student t test (see Chapter 11)    Wilcoxon Signed-Ranks test
Two-group Student t test (see Chapter 11)              Mann-Whitney U test
One-way ANOVA (see Chapter 11)                         Kruskal-Wallis test
Pearson Correlation test (see Chapter 15)              Spearman Rank Correlation test

Most nonparametric tests involve first sorting your data values, from lowest to highest, and recording

the rank of each measurement. Ranks are like class ranks in school, where the person with the highest

grade point average (GPA) is ranked number 1, and the person with the next highest GPA is ranked

number 2, and so on. Ranking forces each individual to be separated from the next by one unit of rank.

In data, ranking runs from the bottom up: the lowest value has a rank of 1, the next lowest value has a rank of 2, and so on. All

subsequent calculations are done with these ranks rather than with the actual data values. However,

using ranks instead of the actual data loses information, so you should avoid using nonparametric tests

if your data qualify for parametric methods.
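To make the idea of "calculating with ranks" concrete, the sketch below ranks two made-up variables and then computes an ordinary Pearson correlation on the ranks, which is how the Spearman rank correlation in Table 3-2 is defined. The data and the helper names `ranks_no_ties` and `pearson` are hypothetical, and the ranking function assumes there are no tied values.

```python
def ranks_no_ties(values):
    """Rank values from lowest (rank 1) to highest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: age (years) and length of stay (days), made up.
age = [34, 51, 47, 62, 70, 29]
stay = [2, 5, 3, 9, 40, 1]

age_ranks = ranks_no_ties(age)
stay_ranks = ranks_no_ties(stay)

print("age ranks: ", age_ranks)
print("stay ranks:", stay_ranks)
print("Pearson on raw values:      ", round(pearson(age, stay), 3))
print("Pearson on ranks (Spearman):", round(pearson(age_ranks, stay_ranks), 3))
```

In these made-up data, the 40-day stay is an outlier, but after ranking it is only one rank above the 9-day stay. That shows both sides of the trade-off: ranks are not distorted by extreme values, but the size of the gap between 9 and 40 days is lost.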

Although nonparametric tests don’t assume normality, they do make certain assumptions about your

data. For example, many nonparametric tests assume that you don’t have any tied values in your data

set (in other words, no two participants have exactly the same value). Most nonparametric tests

incorporate adjustments for the presence of ties, but this weakens the test and makes the results less

exact.
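The text does not say how ties are adjusted for. One common convention, assumed here purely for illustration, is to give each tied value the average of the ranks the group would otherwise occupy (sometimes called midranks). The function name `average_ranks` and the data are hypothetical.

```python
def average_ranks(values):
    """Rank from lowest to highest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Find the end of the run of equal values beginning at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end are 0-based; ranks are 1-based,
        # so the run occupies ranks start+1 through end+1.
        shared_rank = ((start + 1) + (end + 1)) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


# Hypothetical lengths of stay in days, with ties (made-up values).
stay = [3, 1, 3, 7, 3, 1]
print(average_ranks(stay))
```

With this convention, the ranks still add up to the same total as they would without ties, but several participants now share a rank. That is one way to see why ties make rank-based results less exact.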

Even in descriptive statistics, the common parameters have nonparametric counterparts.

Although means and standard deviations can be calculated for any set of numbers, they’re most